Logistic Regression Method for Class Imbalance Problem
GUO Hua-Ping1, DONG Ya-Dong2, WU Chang-An1, FAN Ming2
1. College of Computer and Information Technology, Xinyang Normal University, Xinyang 414000; 2. School of Information Engineering, Zhengzhou University, Zhengzhou 450052
Abstract: Logistic regression (LR) is one of the most important classification models in pattern recognition and machine learning; it is interpretable and generalizes well. In this paper, the LR model is applied to the class imbalance problem, and a method named LR for class imbalance (LRCI) is proposed. To take data imbalance fully into account, two objective functions, a G-mean based metric (GBM) and an F-measure based metric (FBM), are constructed to supervise the learning of the LRCI model parameters. As a result, the model is effectively guaranteed to achieve both high accuracy and a high recall rate. Experimental results on UCI datasets show that LRCI significantly improves recall, G-mean and F-measure while maintaining high accuracy. In addition, LRCI shows a significant advantage over other state-of-the-art class imbalance learning models.
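For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how recall, G-mean and F-measure are conventionally computed from a binary confusion matrix. It uses the standard textbook definitions (G-mean as the geometric mean of recall and specificity, F-measure as the harmonic mean of precision and recall); it is not the LRCI objective itself, whose exact form the abstract does not give. The function names `confusion_counts` and `imbalance_metrics` and the toy labels are hypothetical, chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class label."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def imbalance_metrics(y_true, y_pred, positive=1):
    """Standard definitions of accuracy, recall, G-mean and F-measure."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    g_mean = math.sqrt(recall * specificity)
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(y_true) if len(y_true) else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "g_mean": g_mean,
        "f_measure": f_measure,
    }


# Toy imbalanced data (illustrative only): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts the majority class.
always_negative = [0] * 10
print(imbalance_metrics(y_true, always_negative))

# A classifier that finds one of the two positives at the cost of one false alarm.
finds_one = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(imbalance_metrics(y_true, finds_one))
```

Both toy classifiers reach the same accuracy, yet the first never detects a minority example, so its recall, G-mean and F-measure are all zero. This is the usual motivation for evaluating, and in LRCI's case training, with measures that are sensitive to the minority class.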